Feature Selection Guided by Structural Information
نویسندگان
چکیده
In generalized linear regression problems with an abundant number of features, lasso-type regularization which imposes an `-constraint on the regression coefficients has become a widely established technique. Crucial deficiencies of the lasso were unmasked when Zhou and Hastie (2005) introduced the elastic net. In this paper, we propose to extend the elastic net by admitting general nonnegative quadratic constraints as second form of regularization. The generalized ridge-type constraint will typically make use of the known association structure of features, e.g. by using temporalor spatial closeness. We study properties of the resulting ’structured elastic net’ regression estimation procedure, including basic asymptotics and the issue of model selection consistency. In this vein, we provide an analog to the so-called ’irrepresentable condition’ which holds for the lasso. An oracle property is established by incorporating a scaled `-constraint. Moreover, we outline algorithmic solutions for the structured elastic net within the generalized linear model family. The rationale and the performance of our approach is illustrated by means of simulated and real world data. keywords: generalized linear model, regularization, sparsity, p n, lasso, elastic net, random fields, consistency, epiconvergence, model selection, signal regression.
منابع مشابه
Feature Selection in Structural Health Monitoring Big Data Using a Meta-Heuristic Optimization Algorithm
This paper focuses on the processing of structural health monitoring (SHM) big data. Extracted features of a structure are reduced using an optimization algorithm to find a minimal subset of salient features by removing noisy, irrelevant and redundant data. The PSO-Harmony algorithm is introduced for feature selection to enhance the capability of the proposed method for processing the measure...
متن کاملFeature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine
Different approaches have been proposed for feature selection to obtain suitable features subset among all features. These methods search feature space for feature subsets which satisfies some criteria or optimizes several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, features subsets are selected due to some measu...
متن کاملA New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...
متن کاملOnline Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
متن کاملA hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High dimensional microarray datasets are difficult to classify since they have many features with small number ofinstances and imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improvethe classification performance of microarray datasets by selecting the significant features. Combining the concepts ofrough sets, weighted rough set, fuzzy rough se...
متن کاملFuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
متن کامل